Feature request #378: make sure that results obtained from different benchmark versions are not compared (#504)
Conversation
Walkthrough

The PR lazy-loads GlobalStatus in JadeApp, moves raw-results and simulations parsing to cached properties, adds benchmark-version validation for requested libraries (raising VersionInconsistencyError on mismatch), refactors simulation success checkers into classes, updates run/post-processing to use the new checkers and validation, and adapts tests accordingly.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant App as JadeApp
    participant Status as GlobalStatus
    participant Validator as _validate_libs_processing
    participant Post as Excel/Atlas Processing
    User->>App: post_process(codelibs, benchmark)
    App->>Status: access status (lazy-load if _status is None)
    Status-->>App: GlobalStatus instance
    App->>Status: validate_codelibs(codelibs, benchmark)
    Status->>Validator: _validate_libs_processing(code, benchmark, libs)
    alt Version mismatch
        Validator-->>Status: VersionInconsistencyError
        Status-->>App: raise error
        App-->>User: post_process fails
    else Versions consistent
        Validator-->>Status: OK
        Status-->>App: validation passed
        App->>Post: create outputs & run Excel/Atlas processing
        Post-->>App: processing complete
        App-->>User: post_process succeeds
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (warning)
Codecov Report: ❌ Patch coverage is
Actionable comments posted: 4
🧹 Nitpick comments (1)
tests/config/test_status.py (1)
52-60: Exercise the public `validate_codelibs()` path in this test.

`JadeApp.post_process()` calls the public wrapper with a `[(CODE, lib), ...]` list. Going straight to `_validate_libs_processing()` bypasses that grouping/dispatch layer, so this can miss bugs in the real production entry point and will break on harmless internal refactors.

♻️ Suggested test shape
```diff
-    with pytest.raises(VersionInconsistencyError):
-        status._validate_libs_processing(
-            CODE.MCNP, "Oktavian", ["FENDL 3.2c", "ENDFB-VIII.0"]
-        )
-    status._validate_libs_processing(
-        CODE.MCNP, "Sphere", ["FENDL 3.2c", "ENDFB-VIII.0"]
-    )
+    with pytest.raises(VersionInconsistencyError):
+        status.validate_codelibs(
+            [(CODE.MCNP, "FENDL 3.2c"), (CODE.MCNP, "ENDFB-VIII.0")],
+            "Oktavian",
+        )
+    status.validate_codelibs(
+        [(CODE.MCNP, "FENDL 3.2c"), (CODE.MCNP, "ENDFB-VIII.0")],
+        "Sphere",
+    )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/config/test_status.py` around lines 52 - 60, The test should exercise the public entrypoint validate_codelibs instead of calling the private helper _validate_libs_processing directly; replace the two calls to status._validate_libs_processing(...) with calls that pass a list of (CODE, lib) pairs to status.validate_codelibs([(CODE.MCNP, "Oktavian")]) wrapped in pytest.raises(VersionInconsistencyError) for the failing case, and status.validate_codelibs([(CODE.MCNP, "Sphere")]) for the succeeding case so the test follows the same grouping/dispatch path used by JadeApp.post_process().
📒 Files selected for processing (9)

- src/jade/app/app.py
- src/jade/config/status.py
- src/jade/helper/aux_functions.py
- src/jade/helper/errors.py
- src/jade/post/sim_output.py
- src/jade/run/benchmark.py
- tests/app/test_app.py
- tests/config/test_status.py
- tests/dummy_structure/raw_data/_mcnp_-_ENDFB-VIII.0_/Oktavian/metadata.json
```python
        self._status = None

    @property
    def status(self) -> GlobalStatus:
        """Lazy-load the global status on first access."""
        if self._status is None:
            logging.info("Initializing the global status (lazy-loaded)")
            self._status = GlobalStatus(
                simulations_path=self.tree.simulations,
                raw_results_path=self.tree.raw,
            )
        return self._status
```
Invalidate the cached GlobalStatus after write operations.

This property memoizes one `GlobalStatus` snapshot for the lifetime of `JadeApp`. After `run_benchmarks()` or `raw_process()` changes the simulations/raw trees, later calls still consult stale `simulations` / `raw_data` unless callers refresh manually; tests/app/test_app.py lines 63-72 now has to do that explicitly. A same-session `raw_process()` → `post_process()` flow can therefore miss the raw results it just produced.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/jade/app/app.py` around lines 58 - 69, The cached GlobalStatus returned
by JadeApp.status can become stale after write operations; update JadeApp so
that methods that mutate the simulations/raw trees (e.g., run_benchmarks,
raw_process, and any other writer methods) invalidate the cache by setting
self._status = None after completing their write work, and ensure any helper
functions that modify the tree do the same; locate the status property and add
cache-clearing calls in the implementations of run_benchmarks and raw_process
(and post_process if it mutates data) so subsequent accesses to JadeApp.status
construct a fresh GlobalStatus.
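The invalidate-on-write pattern the comment asks for can be sketched in isolation. This is a minimal stand-alone sketch: `GlobalStatus` here is a counting stand-in, and `run_benchmarks()` is reduced to the cache-clearing step, so none of it is jade's real implementation.

```python
class GlobalStatus:
    """Stand-in for jade's GlobalStatus: records how many times it was built."""

    instances = 0

    def __init__(self):
        GlobalStatus.instances += 1


class JadeApp:
    """Minimal sketch of lazy-load on read, invalidate on write."""

    def __init__(self):
        self._status = None

    @property
    def status(self) -> GlobalStatus:
        # Lazy-load on first access, reuse the snapshot afterwards
        if self._status is None:
            self._status = GlobalStatus()
        return self._status

    def run_benchmarks(self):
        # ... write new simulation results to disk ...
        self._status = None  # drop the stale snapshot


app = JadeApp()
first = app.status
assert app.status is first  # cached between reads
app.run_benchmarks()
second = app.status
assert second is not first  # rebuilt after a write
```

The same `self._status = None` line would go at the end of every writer method (`raw_process()`, and `post_process()` if it mutates data), so readers never see a pre-write snapshot.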
```python
        if self._simulations is None:
            self._simulations = self._parse_simulations_folder(self.simulations_path)
        return self._simulations
```
Add a refresh path for cached simulation state.

After the first `simulations` access, this instance never rescans `simulations_path`. `update_raw_results()` only refreshes raw outputs, so `was_simulated()` and `get_successful_simulations()` can stay stale for the lifetime of a long-lived `GlobalStatus`.
Possible fix

```diff
+    def update_simulations(self) -> None:
+        """Update the simulations by re-parsing the simulations folder."""
+        self._simulations = self._parse_simulations_folder(self.simulations_path)
+
     def update_raw_results(self) -> None:
         """Update the raw results by re-parsing the raw results folder. It should be used
         after processing new raw results to update the status.
         """
         self._raw_data, self._raw_metadata = self._parse_raw_results_folder(
             self.raw_results_path
         )
```

Also applies to: 66-72
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/jade/config/status.py` around lines 46 - 48, The cached attribute
self._simulations (accessed by the simulations property) is never refreshed
after first access, so methods like was_simulated() and
get_successful_simulations() can become stale; modify the simulations property
(and the similar block at lines ~66-72) to add an explicit refresh path (e.g.,
an optional force_refresh flag or refresh method) that calls
self._parse_simulations_folder(self.simulations_path) to repopulate
self._simulations when update_raw_results() runs or when callers request a fresh
scan; update GlobalStatus to call that refresh (or pass force_refresh=True)
before was_simulated() / get_successful_simulations() use the cache so the
folder is rescanned for new/removed simulations.
```python
        if self._raw_data is None:
            self._raw_data, self._raw_metadata = self._parse_raw_results_folder(
                self.raw_results_path
            )
```
Make raw-data discovery tolerant to missing metadata, then fail cleanly during validation.

`raw_data` now depends on `metadata.json` being present for every non-exp benchmark, so folders without that file will break simple availability queries. `_validate_libs_processing()` then indexes `"benchmark_version"` directly, which turns malformed metadata into a raw `KeyError` instead of a clear consistency error.
Possible fix

```diff
-                    available_raw_data[(CODE(code), lib, benchmark)] = os.listdir(
-                        bench_path
-                    )
+                    available_raw_data[(CODE(code), lib, benchmark)] = [
+                        entry for entry in os.listdir(bench_path)
+                        if entry != "metadata.json"
+                    ]
                     if lib != "exp":  # for the experiments we don't have metadata
-                        with open(os.path.join(bench_path, "metadata.json")) as infile:
-                            metadata[(CODE(code), lib, benchmark)] = json.load(infile)
+                        metadata_file = Path(bench_path, "metadata.json")
+                        if metadata_file.is_file():
+                            with metadata_file.open() as infile:
+                                metadata[(CODE(code), lib, benchmark)] = json.load(infile)
         ...
         versions = {}
         for lib in libs:
-            versions[lib] = self.raw_metadata[(code, lib, benchmark)][
-                "benchmark_version"
-            ]
+            version = self.raw_metadata.get((code, lib, benchmark), {}).get(
+                "benchmark_version"
+            )
+            if version is None:
+                raise VersionInconsistencyError(
+                    f"Missing benchmark_version metadata for {code} / {lib} / {benchmark}"
+                )
+            versions[lib] = version
```

Also applies to: 60-63, 144-146, 292-294
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/jade/config/status.py` around lines 52 - 55, The raw-data discovery
currently requires metadata.json and lets missing or malformed metadata raise
KeyError later; update the logic in the raw_data getter and in
_parse_raw_results_folder to tolerate absent/malformed metadata by returning
None or an empty dict for self._raw_metadata when metadata.json is missing or
invalid, and change _validate_libs_processing to check for the presence of keys
like "benchmark_version" via .get or explicit presence checks on
self._raw_metadata before indexing so it raises a clear ConsistencyError (or
similar) with a descriptive message instead of a raw KeyError; ensure
raw_results_path, _raw_metadata, _raw_data, _parse_raw_results_folder, and
_validate_libs_processing are the symbols updated so availability queries
succeed and validation fails cleanly.
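The tolerant lookup plus version comparison described above can be sketched end to end. Everything below is illustrative: the `validate_versions` helper and the plain-dict metadata only mirror the shape of jade's `raw_metadata` keyed on `(code, lib, benchmark)`, not its actual API.

```python
class VersionInconsistencyError(Exception):
    """Raised when requested libraries mix benchmark versions (or lack metadata)."""


def validate_versions(raw_metadata: dict, code: str, benchmark: str, libs: list[str]) -> None:
    """Raise if the requested libs were produced by different benchmark versions."""
    versions = {}
    for lib in libs:
        # .get() twice: absent folder metadata and absent key both surface as None
        version = raw_metadata.get((code, lib, benchmark), {}).get("benchmark_version")
        if version is None:
            raise VersionInconsistencyError(
                f"Missing benchmark_version metadata for {code}/{lib}/{benchmark}"
            )
        versions[lib] = version
    if len(set(versions.values())) > 1:
        raise VersionInconsistencyError(
            f"Benchmark versions differ across libraries: {versions}"
        )


meta = {
    ("mcnp", "FENDL 3.2c", "Oktavian"): {"benchmark_version": "1.0"},
    ("mcnp", "ENDFB-VIII.0", "Oktavian"): {"benchmark_version": "2.0"},
    ("mcnp", "FENDL 3.2c", "Sphere"): {"benchmark_version": "1.1"},
    ("mcnp", "ENDFB-VIII.0", "Sphere"): {"benchmark_version": "1.1"},
}

# Consistent versions: passes silently
validate_versions(meta, "mcnp", "Sphere", ["FENDL 3.2c", "ENDFB-VIII.0"])

# Mismatched versions: raises a descriptive domain error, not a KeyError
try:
    validate_versions(meta, "mcnp", "Oktavian", ["FENDL 3.2c", "ENDFB-VIII.0"])
    mismatch = ""
except VersionInconsistencyError as err:
    mismatch = str(err)
```

The key point is that both failure modes (missing metadata and inconsistent versions) raise the same clearly-named exception, so callers like `post_process()` only need one `except` clause.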
```diff
 class SimulationChecker(ABC):
     """Abstract base class for simulation success checkers.

     All code-specific checkers must inherit from this class and implement
     the check_success method with the same signature.
     """

-def check_run_openmc(folder: PathLike) -> bool:
-    """check if openmc run was successful"""
-    try:
-        OpenMCSimOutput.retrieve_file(folder)
-        return True
-    except FileNotFoundError:
-        return False
+    @abstractmethod
+    def check_success(self, files: list[str]) -> bool:
+        """Check if a simulation run was successful.
+
+        Parameters
+        ----------
+        files : list[str]
+            List of output files to check for success.
+
+        Returns
+        -------
+        bool
+            True if the simulation completed successfully, False otherwise.
+        """
+        pass


 class MCNPChecker(SimulationChecker):
     """Checker for MCNP simulations."""

     def check_success(self, files: list[str]) -> bool:
         """Check if MCNP run was successful by verifying output files exist."""
         return MCNPSimOutput.is_successfully_simulated(files)


 class OpenMCChecker(SimulationChecker):
     """Checker for OpenMC simulations."""

     def check_success(self, files: list[str]) -> bool:
         """Check if OpenMC run was successful by verifying output files exist."""
         return OpenMCSimOutput.is_successfully_simulated(files)


 class SerpentChecker(SimulationChecker):
     """Checker for Serpent simulations."""

     def check_success(self, files: list[str]) -> bool:
         """Check if Serpent run was successful."""
         # TODO implement the logic to check if the Serpent run was successful
         raise NotImplementedError("Serpent checker not yet implemented")

-def check_run_serpent(folder: PathLike) -> bool:
-    # TODO implement the logic to check if the Serpent run was successful
-    raise NotImplementedError


 class D1SChecker(SimulationChecker):
     """Checker for D1S simulations."""

-def check_run_d1s(folder: PathLike) -> bool:
-    """check if d1s run was successful"""
-    return check_run_mcnp(folder)
+    def check_success(self, files: list[str]) -> bool:
+        """Check if D1S run was successful (uses same logic as MCNP)."""
+        return MCNPChecker().check_success(files)
```
Keep the run directory in `check_success()`.

The new `files: list[str]` contract strips the context needed to validate actual completion. As used by src/jade/config/status.py lines 94-98 and src/jade/run/benchmark.py lines 441-445, these checkers now have to infer success from filename presence alone, which can mark interrupted runs with leftover artifacts as complete and then skip reruns or post-process bad outputs.
💡 Suggested interface direction

```diff
 class SimulationChecker(ABC):
     """Abstract base class for simulation success checkers.
@@
     @abstractmethod
-    def check_success(self, files: list[str]) -> bool:
+    def check_success(
+        self, run_dir: PathLike, files: list[str] | None = None
+    ) -> bool:
         """Check if a simulation run was successful.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/jade/helper/aux_functions.py` around lines 72 - 126, The checkers lost
run-directory context because SimulationChecker.check_success(files: list[str])
only receives filenames; change the abstract signature to check_success(files:
list[str], run_dir: str | Path) -> bool (or add an optional run_dir: str | Path
= None) on SimulationChecker and update all concrete implementations
(MCNPChecker.check_success, OpenMCChecker.check_success,
SerpentChecker.check_success, D1SChecker.check_success) to accept and use
run_dir when resolving/validating outputs (keeping existing behavior if run_dir
is None for backward compatibility); then update callers (e.g., the code paths
in src/jade/config/status.py and src/jade/run/benchmark.py) to pass the run
directory into check_success and update docstrings/type hints accordingly.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/jade/config/status.py (1)
314-321: Redundant `CODE` conversion.

Line 316 converts `code` to `CODE(code)`, but the type hint on line 302 declares `codelibs: list[tuple[CODE, str]]`, indicating `code` should already be a `CODE` enum. If callers may pass strings, consider updating the type hint; otherwise, remove the redundant conversion.

♻️ Proposed simplification (if type hint is accurate)

```diff
         for code, lib in codelibs:
-            code = CODE(code)
             if lib == "exp":  # for the experiments we don't have metadata
                 continue
             if code not in libs:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/jade/config/status.py` around lines 314 - 321, The loop is redundantly converting code to CODE(code) even though codelibs is typed as list[tuple[CODE, str]]; remove the unnecessary conversion in the for loop (i.e., stop reassigning code = CODE(code)) so the existing CODE instance from codelibs is used directly, or if callers may pass strings instead of CODE update the codelibs type hint and add validation/conversion at the call sites; target the for loop that iterates over codelibs and the use of CODE to decide which path to take.
📒 Files selected for processing (1)

- src/jade/config/status.py
```python
        Returns
        -------
        bool
            True if the post-processing can be performed, False otherwise.
```
Docstring incorrectly states return type.

The docstring states "Returns -> bool: True if the post-processing can be performed" but the method returns `None` or raises `VersionInconsistencyError`. Update the docstring to reflect the actual behavior.
📝 Proposed docstring fix

```diff
-        Returns
-        -------
-        bool
-            True if the post-processing can be performed, False otherwise.
+        Raises
+        ------
+        VersionInconsistencyError
+            If the benchmark versions are not consistent across the requested libraries.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/jade/config/status.py` around lines 285 - 288, The docstring is
incorrect: update the docstring for the function that raises
VersionInconsistencyError (the function documented in this block) to state that
it returns None on success and to include a Raises section documenting
VersionInconsistencyError instead of claiming it returns bool; specifically
change the Returns section to "None" (or remove it) and add a
"Raises\n-------\nVersionInconsistencyError: when post-processing cannot be
performed" so the docstring matches the actual behavior.
```python
    def validate_codelibs(
        self, codelibs: list[tuple[CODE, str]], benchmarks: str
    ) -> None:
        """Check that a list of codelibs can be post-processed together for a given benchmark.
        This is true if the benchmark version for the requested libs are the same.

        Parameters
        ----------
        codelibs : list[tuple[CODE, str]]
            list of codelibs to check. Each codelib is a tuple with the code and library.
        benchmark : str
            benchmark name.
```
Parameter name and docstring inconsistency.

The parameter is named `benchmarks` (line 302) but the docstring references `benchmark` (lines 311-312). Since the type hint indicates `str` (a single benchmark), consider renaming the parameter to `benchmark` for consistency.
📝 Proposed fix

```diff
     def validate_codelibs(
-        self, codelibs: list[tuple[CODE, str]], benchmarks: str
+        self, codelibs: list[tuple[CODE, str]], benchmark: str
     ) -> None:
         """Check that a list of codelibs can be post-processed together for a given benchmark.
         ...
         for code, lib_list in libs.items():
-            self._validate_libs_processing(code, benchmarks, lib_list)
+            self._validate_libs_processing(code, benchmark, lib_list)
```

🤖 Prompt for AI Agents
+ self._validate_libs_processing(code, benchmark, lib_list)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/jade/config/status.py` around lines 301 - 312, The validate_codelibs
method's parameter is named benchmarks but its type hint and docstring describe
a single benchmark; rename the parameter to benchmark to match the docstring and
type hint and update any internal uses and callers accordingly (search for
validate_codelibs and change the parameter name in its signature and all
invocations to benchmark) so documentation and code are consistent.
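The grouping step that `validate_codelibs` performs before handing each group to the per-code check can be sketched as follows; plain strings stand in for jade's `CODE` enum, and the skip rule for `"exp"` entries mirrors the comment in the diff above.

```python
from collections import defaultdict

# Requested (code, library) pairs for one post-processing run
codelibs = [
    ("mcnp", "FENDL 3.2c"),
    ("mcnp", "ENDFB-VIII.0"),
    ("mcnp", "exp"),
    ("openmc", "ENDFB-VIII.0"),
]

# Group the requested libraries per code
libs = defaultdict(list)
for code, lib in codelibs:
    if lib == "exp":  # experimental results carry no benchmark_version metadata
        continue
    libs[code].append(lib)

groups = dict(libs)
# One per-code validation call (_validate_libs_processing) would follow per group
```

Testing the public wrapper, as the nitpick suggests, exercises exactly this grouping plus the skip rule, which a direct call to the private helper bypasses.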
Fix #378
At last, metadata are used to ensure that, during post-processing of a specific benchmark (and code), the library results requested were all obtained from the same benchmark version.
To be reviewed after #500.